CROSS-REFERENCES TO RELATED APPLICATIONS



[0001] The present application claims priority to U.S. Provisional Patent Application No.
60/405,589, filed August 14, 2002, the disclosure of which is incorporated herein in its 
entirety for all purposes. 

STATEMENT AS TO RIGHTS TO INVENTIONS MADE UNDER FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT



[0002] The present invention was supported by a grant from the National Institutes of 
7^331 

Health (CA ?603i). The Government may have rights in this invention. 



[0003] Protein post-translational modification is one of the dominant mechanisms of 
information transfer in cells. A major goal of current proteomic efforts is to generate a
system level map describing all the sites of protein post-translational modification. Recent 
effort toward this goal has focused on developing new technologies for enriching and 
quantitating phosphopeptides. By contrast, identification of the sites of phosphorylation 
typically relies exclusively on the use of tandem mass spectrometry to sequence individual
peptides. 

[0004] Much of the complexity of higher organisms is believed to reside in the specific 
post-translational modification of proteins (Venter et al., Science, 2001, 291(5507):
1304-51). Protein phosphorylation is the most ubiquitous such modification; almost 2% of the
human genome encodes protein kinases and an estimated one-third of all proteins contain a
covalently bound phosphate group (Manning et al., Science, 2002, 298(5600): 1912-34).
Due to the importance of protein phosphorylation in regulating cellular signaling events,
there is intense interest in developing technologies for mapping phosphorylation events on a
proteome-wide scale. 

[0005] Existing approaches for phosphorylation site mapping rely almost exclusively on 
the use of tandem mass spectrometry (MS/MS) to sequence individual peptides in order to
localize sites of phosphorylation. Despite the power of this approach, MS/MS of
phosphopeptides remains challenging due to (i) the signal suppression of phosphate 
BACKGROUND OF THE INVENTION




BRIEF SUMMARY OF THE INVENTION 
[0008] The present invention provides novel endopeptidases for use in mapping post-
translational modification sites in a genome, such as the human genome. The present
invention provides endopeptidases that, surprisingly, site-specifically cleave a post-
translationally modified polypeptide at a site of post-translational modification.

[0009] In a first aspect, the invention provides a method of mapping the sites of 
polypeptide post-translational modifications. The method includes site-specifically cleaving 
a peptide bond of the post-translationally modified polypeptide with an endopeptidase at a 
site of post-translational modification to produce a degraded post-translationally modified 
polypeptide. After cleavage at the site of post-translational modification, the site of post-
translational modification is determined.
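
As an illustrative sketch only, and not part of the disclosed method, the mapping logic of this aspect can be modeled computationally. The sketch assumes, hypothetically, that the endopeptidase cleaves the peptide bond immediately C-terminal to each modified residue and that the ordered fragments are available from an analytical readout. The function names `cleave_at_modified_sites` and `infer_sites` and the example sequence are illustrative and do not come from the disclosure.

```python
def cleave_at_modified_sites(sequence, modified_positions):
    """Split a one-letter sequence after each modified residue.

    Positions are 1-based. Assumption (hypothetical): cleavage occurs at the
    peptide bond immediately C-terminal to the modified residue.
    """
    fragments = []
    start = 0
    for pos in sorted(set(modified_positions)):
        if not 1 <= pos <= len(sequence):
            raise ValueError(f"position {pos} is outside the sequence")
        if pos == len(sequence):
            continue  # no peptide bond follows the last residue
        fragments.append(sequence[start:pos])
        start = pos
    fragments.append(sequence[start:])
    return fragments


def infer_sites(fragments):
    """Recover modification positions from ordered fragments.

    Every fragment except the last one ends at a cleavage site, so the
    running length of the fragments gives the 1-based site positions.
    """
    sites = []
    offset = 0
    for fragment in fragments[:-1]:
        offset += len(fragment)
        sites.append(offset)
    return sites


# Hypothetical peptide with modified tyrosines at positions 4 and 9.
peptide = "AAPYGFKSYDE"
fragments = cleave_at_modified_sites(peptide, [4, 9])
print(fragments)
print(infer_sites(fragments))
```

In practice the fragment order is not observed directly; it would have to be reconstructed from the fragmentation pattern, for example by matching fragment masses to the known polypeptide sequence.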

[0010] In another aspect, the present invention provides an endopeptidase that site-
specifically cleaves a peptide bond of a post-translationally modified polypeptide at a site of
post-translational modification, wherein the endopeptidase comprises an active site that binds
to said post-translational modification.

[0011] In another aspect, the endopeptidases of the present invention are produced by a
method that includes introducing one or more point mutations into a model endopeptidase at
one or more candidate amino acid positions in an active site of the model endopeptidase to
produce a plurality of candidate endopeptidases. At least one of the plurality of the candidate
endopeptidases is an endopeptidase of the present invention that site-specifically cleaves a
peptide bond of a post-translationally modified polypeptide at a site of post-translational
modification. The endopeptidase that site-specifically cleaves at said site of post-
translational modification is identified by contacting each of the plurality of candidate
endopeptidases with the post-translationally modified polypeptide to determine whether or
not each candidate endopeptidase site-specifically cleaves the peptide bond of the
polypeptide at the site of a post-translational modification.

[0012] In another aspect, the present invention provides an isolated nucleic acid encoding an
endopeptidase which site-specifically cleaves a peptide bond of a post-translationally
modified polypeptide at a site of post-translational modification and which comprises one or
more point mutations at one or more amino acid positions within the endopeptidase active
site. The isolated nucleic acid contains a subsequence having at least 70% nucleic acid
sequence identity to a nucleic acid sequence of Figure 2.
[0013] In another aspect, the present invention provides an isolated nucleic acid encoding an
endopeptidase which site-specifically cleaves a peptide bond of a post-translationally
modified polypeptide at a site of post-translational modification and which comprises one or
more point mutations at one or more amino acid positions within the endopeptidase active
site. The isolated nucleic acid hybridizes under highly stringent hybridization conditions to a
nucleic acid sequence of Figure 2, wherein the hybridization reaction is incubated at 42°C in
a solution comprising 50% formamide, 5x SSC and 1% SDS, and washed at 65°C in a
solution comprising 0.2x SSC and 0.1% SDS.

BRIEF DESCRIPTION OF THE DRAWINGS
[0014] Figure 1 is an amino acid sequence of a subtilisin model endopeptidase.

[0015] Figure 2 is a nucleic acid sequence that encodes a subtilisin model endopeptidase.

[0016] Figure 3 illustrates a comparison of a computer generated three-dimensional
structure of the model subtilisin and a phosphotyrosine polypeptide.

[0017] Figure 4 illustrates the phosphotyrosine site-specificity of candidate subtilisin
endopeptidases and the model subtilisin endopeptidase against either an unmodified tyrosine
or phenylalanine.

[0018] Figure 5 shows kinetic data for the site-specific cleavage at a phosphotyrosine by a
subtilisin endopeptidase containing the substitution point mutations P129G and E156R.

[0019] Figure 6 shows kinetic data for the site-specific cleavage at a phosphotyrosine by a
subtilisin endopeptidase containing the substitution point mutations G127S and E156R.

[0020] Figure 7 is an amino acid sequence of a subtilisin model endopeptidase containing a
signal sequence (in bold) and a pro-domain (underlined).

[0021] Figure 8 is a nucleic acid sequence that encodes a subtilisin model endopeptidase
containing a signal sequence (in bold) and a pro-domain (underlined).

DETAILED DESCRIPTION OF THE INVENTION

[0022] In contrast to presently utilized methods of developing a system level map
describing all the sites of post-translational peptide modification, e.g., polypeptide
phosphorylation, the present invention provides an approach for post-translational
modification mapping that makes it possible to enzymatically interrogate a protein sequence
directly to identify sites of post-translational modification.
[0045] "Polypeptide" refers to a polymer in which the monomers are amino acids and are
joined together through amide bonds, alternatively referred to as a "peptide." The terms
"peptide" and "polypeptide" encompass proteins. Unnatural amino acids, for example,
β-alanine, phenylglycine and homoarginine are also included under this definition. Amino
acids that are not gene-encoded may also be used in the present invention. Furthermore,
amino acids that have been modified to include reactive groups may also be used in the
invention. All of the amino acids used in the present invention may be either the D- or L-
isomer. The L-isomers are generally preferred. In addition, other peptidomimetics are also
useful in the present invention. For a general review, see, Spatola, A. F., in Chemistry and
Biochemistry of Amino Acids, Peptides and Proteins, B. Weinstein, eds., Marcel
Dekker, New York, p. 267 (1983).

[0046] A "degraded post-translationally modified polypeptide" refers to the polypeptide
fragments produced by site-specifically cleaving a post-translationally modified polypeptide
at a site of post-translational modification using an endopeptidase of the present invention.

[0047] The term "fragmentation pattern" refers to the configuration of the polypeptide
fragments of the degraded post-translationally modified polypeptide as visualized or
produced by an analytical method. A variety of analytical methods may be used to provide a
fragmentation pattern. For example, where the analytical method is mass spectrometry, the
fragmentation pattern is referred to as a "mass spectral fragmentation pattern." Where the
analytical method is two-dimensional electrophoresis, the fragmentation pattern is referred to
as a "two-dimensional electrophoretic fragmentation pattern."

[0048] The term "amino acid" refers to naturally occurring and synthetic amino acids, as
well as amino acid analogs and amino acid mimetics that function in a manner similar to the
naturally occurring amino acids. Naturally occurring amino acids are those encoded by the
genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline,
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have
the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine,
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified
R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical
structure as a naturally occurring amino acid. "Amino acid mimetics" refers to chemical
video display and a keyboard, a modem, an ISDN terminal adapter, an Ethernet port, a
punched card reader, a magnetic strip reader, or other suitable I/O device. 

[0142] The invention also preferably provides the use of a computer system, such as that
described above, which comprises: (1) a computer; (2) a stored bit pattern encoding a
collection of peptide sequence specificity records obtained by the methods of the invention,
which may be stored in the computer; (3) a comparison post-translationally modified
polypeptide target; and (4) a program for alignment and comparison, typically with rank-
ordering of comparison results on the basis of computed similarity values.
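
As a minimal sketch of element (4), assuming a very simple scoring scheme, the following Python fragment compares a target sequence against stored records and rank-orders the results by a computed similarity value. The similarity measure (fraction of identical residues in an ungapped, left-aligned comparison), the record names, and the sequences are all hypothetical; the disclosure does not specify a particular alignment algorithm or record format.

```python
def similarity(a, b):
    """Fraction of identical residues in an ungapped, left-aligned comparison.

    Hypothetical scoring: matches are divided by the longer length, so a
    length mismatch lowers the score.
    """
    if not a or not b:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / max(len(a), len(b))


def rank_records(target, records):
    """Return (record name, similarity) pairs, highest similarity first.

    Ties are broken alphabetically by record name so the order is stable.
    """
    scored = [(name, similarity(target, seq)) for name, seq in records.items()]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored


# Hypothetical specificity records keyed by an arbitrary record name.
records = {
    "rec1": "AAPYGF",
    "rec2": "AAPFGF",
    "rec3": "GGPYAA",
}
for name, score in rank_records("AAPYGF", records):
    print(f"{name}\t{score:.2f}")
```

A production system would more likely use a gapped alignment with a substitution matrix; the ungapped comparison is used here only to keep the ranking step easy to follow.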

Kits 

[0143] The present invention also provides a kit for practicing a method set forth herein. In
an exemplary embodiment, the kit includes one or more components useful to practice the
method of the invention and instructions for using those components to practice the method of
the invention.

[0144] In a preferred embodiment, the kit includes a container of an endopeptidase of the
present invention and instructions for using the endopeptidase to determine sites of post-
translational modification on a polypeptide. The examples that follow are intended to
further illustrate the invention, not to limit the scope of the invention.

[0145] The terms and expressions which have been employed herein are used as terms of
description and not of limitation, and there is no intention in the use of such terms and
expressions of excluding equivalents of the features shown and described, or portions thereof,
it being recognized that various modifications are possible within the scope of the invention
claimed. Moreover, any one or more features of any embodiment of the invention may be
combined with any one or more other features of any other embodiment of the invention,
without departing from the scope of the invention. For example, the endopeptidases described
in the Endopeptidase section are equally applicable to the informatics methods described
herein. All publications, patents, and patent applications cited herein are hereby incorporated
by reference in their entirety for all purposes.

EXAMPLES 

Materials 

[0146] The BG2036 protease deficient strain of B. subtilis and the pSS5 shuttle vector
containing the subtilisin BPN' gene were employed. All pNA tetrapeptide substrates were
[0155] The resulting test polypeptides are shown in Figure 5, wherein Xxx represents a
phosphotyrosine, sulfonyl tyrosine, tyrosine, phenylalanine, phosphoserine,
phosphothreonine, alanine, valine, leucine, isoleucine, aspartic acid, glutamic acid, arginine,
or lysine as shown. The data in panel A were obtained using a test polypeptide containing a
succinyl-para-nitroanilide fluorogenic donor-acceptor pair. The data in panel B were obtained
using a test polypeptide containing an aminobenzoic acid-tyrosine(NO2)-aspartic acid
fluorogenic donor-acceptor pair.

Example 4 

[0156] Example 4 demonstrates a method for identifying an endopeptidase that site-
specifically cleaves a peptide bond of a post-translationally modified polypeptide. The
method involves assaying the candidate subtilisins of Example 2 with the test polypeptides of
Example 3.

[0157] Kinetics for the fluorogenic substrates of the series Abz-Phe-Arg-Pro-Xxx-Gly-Phe-
Y(NO2)-Asp (SEQ ID NO:5) were measured in 50 mM Bicine, 2 mM CaCl2, pH 8.5 at 25°C by
monitoring fluorescence at 420 nm upon excitation at 320 nm using an instrument. Initial rate
data from 8 substrate concentrations bracketing the KM were measured in triplicate and fit
directly to the Michaelis-Menten equation using the Prism software package (GraphPad, ).
When it was not possible to saturate the enzyme, values for kcat/KM were obtained from
initial rates at low substrate concentrations (10[S] < KM) using the relationship
kcat/KM = v0/([E][S]), where [E] is the enzyme concentration. Kinetics for tetrapeptide
substrates of the series Suc-Ala-Ala-Pro-Xxx-pNA were measured by monitoring the change
in absorbance at 412 nm over time using a Uvikon spectrophotometer. Protein concentrations
were determined spectrophotometrically using an extinction coefficient of 32.2 mM⁻¹ cm⁻¹ at
280 nm (Matsubara, 1965).
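
The kinetic analysis described in this example can be illustrated with a short Python sketch. It is not the Prism procedure used in the example: for simplicity it estimates Vmax and KM with a Hanes-Woolf linearization ([S]/v plotted against [S]) rather than a direct nonlinear fit, and it applies the standard low-substrate approximation kcat/KM ≈ v0/([E][S]). All function names, concentrations, and rate values are hypothetical and in arbitrary units.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def fit_hanes_woolf(substrate, rates):
    """Estimate (Vmax, KM) from a least-squares line through ([S], [S]/v).

    Hanes-Woolf form: [S]/v = [S]/Vmax + KM/Vmax,
    so slope = 1/Vmax and intercept = KM/Vmax.
    """
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


def kcat_over_km_low_substrate(v0, enzyme, s):
    """Approximate kcat/KM from one initial rate measured at [S] << KM."""
    return v0 / (enzyme * s)


# Synthetic, noise-free data: 8 substrate concentrations bracketing KM = 50.
true_vmax, true_km = 2.0, 50.0
concentrations = [5.0, 10.0, 25.0, 50.0, 100.0, 200.0, 400.0, 800.0]
rates = [michaelis_menten(s, true_vmax, true_km) for s in concentrations]
vmax, km = fit_hanes_woolf(concentrations, rates)
print(f"Vmax = {vmax:.3f}, KM = {km:.3f}")

# Low-substrate estimate with a hypothetical enzyme concentration of 0.01.
enzyme = 0.01
low_s = 1.0  # satisfies 10[S] < KM
v0 = michaelis_menten(low_s, true_vmax, true_km)
print(f"kcat/KM = {kcat_over_km_low_substrate(v0, enzyme, low_s):.3f}")
```

With real, noisy triplicate data a direct nonlinear fit is generally preferred, because linearized forms distort the error structure. The low-substrate estimate slightly underestimates kcat/KM, by a factor of KM/(KM + [S]).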

Example 5 

[0158] Example 5 demonstrates that subtilisin endopeptidases that site-specifically cleave a
phosphotyrosine polypeptide at the phosphorylated tyrosine are obtained using the methods
of the present invention, as demonstrated in Examples 1-4.

[0159] Figure 4 illustrates the phosphotyrosine site-specificity of the candidate subtilisin
endopeptidases and the model subtilisin endopeptidase against either an unmodified tyrosine
or phenylalanine. As shown in Figure 4, subtilisin endopeptidases containing the following
substitution point mutations were found to preferentially cleave at the phosphotyrosine residue
over a tyrosine residue or phenylalanine residue: G127S and E156R, P129G and E156R,
PATENT

Attorney Docket No.: 18062G-006600PC
Client Reference No.: SF2003-012

PROTEOME-WIDE MAPPING OF POST-TRANSLATIONAL MODIFICATIONS USING ENDOPEPTIDASES

ABSTRACT OF THE DISCLOSURE
The present invention provides novel endopeptidases that site-specifically
cleave a post-translationally modified polypeptide at a site of post-translational modification.
The present invention further provides methods of making and using the endopeptidases,
including methods of mapping post-translational modifications in the human genome.